Answering Reading Comprehension Tests: N-gram and Multi-View Regression
Abstract
Computer “reading” of natural language is a growing research topic with applications ranging from artificial intelligence to linguistics. A successful “reading” can be interpreted in two ways: 1) extracting facts from a passage and 2) capturing the “meaning” of a text. We focus on the latter interpretation and define “meaning” as it is assessed in humans: can the system answer the same reading comprehension questions that are given to early language learners? We present two approaches to answering cloze questions, which are fill-in-the-blank, multiple-choice reading comprehension questions. The first approach uses an n-gram model, a basic statistical method that captures the local context of a passage to predict an answer. The second approach uses Multi-View Regression (“MVR”), based on canonical correlation analysis (“CCA”), which captures both the local and the broader context. In this approach, the “meaning” of a word is represented as a real-valued state vector associated with each occurrence of that word. MVR is a dynamical belief net that, like a Kalman filter or Hidden Markov Model (“HMM”), estimates the state from the preceding and following sequences of words and predicts the word to be “emitted” from that state. While these statistical methods have been studied extensively in machine learning, and the generation and evaluation of cloze questions have been studied extensively in natural language processing, the connection between the two has not been explored much. Prior research on cloze questions suggests that capturing context properly leads to correct answers. We build a system that reads a set of cloze questions, predicts one set of answers with each of the two methods, which differ in how they capture context, and then we analyze the results.
Because MVR captures both the local and the broader context, it answers cloze questions better than the n-gram approach, which captures only the local context.
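To make the n-gram idea concrete, the following is a minimal sketch of how local context could be used to pick a cloze answer. It is not the authors' system: the bigram order, the add-one smoothing, the toy corpus, and the function names `train_bigrams`, `bigram_prob`, and `answer_cloze` are all assumptions made for illustration.

```python
from collections import Counter


def train_bigrams(tokens):
    """Count unigrams and adjacent word pairs in a list of tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams, vocab_size):
    """Add-one smoothed estimate of P(word | prev) (smoothing is an assumption)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def answer_cloze(left, right, choices, unigrams, bigrams):
    """Return the choice that fits best between its left and right neighbours."""
    vocab_size = len(unigrams)

    def score(choice):
        # Local context only: the word before and the word after the blank.
        return (bigram_prob(left, choice, unigrams, bigrams, vocab_size)
                * bigram_prob(choice, right, unigrams, bigrams, vocab_size))

    return max(choices, key=score)


# Toy training text (hypothetical, for illustration only).
corpus = ("the cat sat on the mat . the dog sat on the rug . "
          "the cat ate the fish .").split()
unigrams, bigrams = train_bigrams(corpus)

# Cloze question: "the cat ___ on the mat" with three candidate answers.
best = answer_cloze("cat", "on", ["sat", "ate", "fish"], unigrams, bigrams)
print(best)
```

A scorer of this kind sees only the words adjacent to the blank, which is the limitation the abstract attributes to the n-gram approach; evidence from elsewhere in the passage never enters the score.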
Similar Resources
Differences in the Use of Multiple-choice Test-taking Strategies by Iranian EFL Learners Regarding Reading Comprehension Ability
The study investigated differences in the use of multiple-choice test-taking strategies by Iranian EFL learners regarding reading comprehension ability. Reading is the most important academic language skill and receives particular focus in second or foreign language teaching; tests are also regularly applied to assess academic performance. This paper sought to investigate differences in th...
MEMEN: Multi-layer Embedding with Memory Networks for Machine Comprehension
Machine comprehension (MC) style question answering is a representative problem in natural language processing. Previous methods rarely spend time on improving the encoding layer, especially the embedding of syntactic information and named entities of the words, which are crucial to the quality of encoding. Moreover, existing attention methods represent each query word as a vector or use ...
The Effect of Grammar vs. Vocabulary Pre-teaching on EFL Learners’ Reading Comprehension: A Schema-Theoretic View of Reading
This study was designed to investigate the effect of grammar and vocabulary pre-teaching, as two types of pre-reading activities, on Iranian EFL learners’ reading comprehension from a schema-theoretic perspective. The sample consisted of 90 female students studying at pre-university centers of Isfahan. The subjects were randomly divided into three groups of equal size. They participated ...
The Effect of Lexical Collocational Density on the Iranian EFL Learners’ Reading Comprehension
The present study aims at investigating the effect of different levels of lexical collocational density on EFL learners’ reading comprehension. Eighty sophomore students with different levels of proficiency studying at Zand Institute of Higher Education in Shiraz, Iran were chosen from among eighty-five learners based on their score distribution on a reduced TOEFL test constructed by Education...
THE IMPACT OF LINGUISTIC AND EMOTIONAL INTELLIGENCE ON THE READING PERFORMANCE OF IRANIAN EFL LEARNERS
Following innovations in intelligence and its radical changes from a unitary concept (IQ) to a multi-dimensional conceptualization, i.e. multiple intelligences and the need to design classroom activities based on the L2 learners’ cognitive styles, this study examined the impact of linguistic intelligence and emotional intelligence on the reading comprehension ability of the Iranian EFL learners...